Object Detection with YOLOv8: A Practical Application¶

This notebook demonstrates a complete object detection workflow using YOLOv8, one of the most practical and efficient models for real-world applications.

What you'll learn:

  • Using pre-trained YOLOv8 for immediate object detection
  • Fine-tuning on a custom dataset (pedestrian detection)
  • Evaluating and visualizing results

Why YOLOv8? Fast, accurate, easy to use, and excellent for deployment.

In [1]:
# Installation
!pip install ultralytics opencv-python matplotlib pillow
Collecting ultralytics
  Downloading ultralytics-8.3.213-py3-none-any.whl.metadata (37 kB)
Requirement already satisfied: opencv-python, matplotlib, pillow, and the ultralytics dependencies (numpy, torch, torchvision, etc.) in /usr/local/lib/python3.12/dist-packages (full dependency list truncated)
Collecting ultralytics-thop>=2.0.0 (from ultralytics)
  Downloading ultralytics_thop-2.0.17-py3-none-any.whl.metadata (14 kB)
Downloading ultralytics-8.3.213-py3-none-any.whl (1.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 49.6 MB/s eta 0:00:00
Downloading ultralytics_thop-2.0.17-py3-none-any.whl (28 kB)
Installing collected packages: ultralytics-thop, ultralytics
Successfully installed ultralytics-8.3.213 ultralytics-thop-2.0.17
In [2]:
from ultralytics import YOLO
import cv2
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
import urllib.request
import os

print('Environment ready!')
Creating new Ultralytics Settings v0.0.6 file ✅ 
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
Environment ready!

Part 1: Quick Start - Pre-trained Detection¶

Let's start by using YOLOv8 pre-trained on the COCO dataset (80 common object classes).

In [3]:
# Load pre-trained YOLOv8
model = YOLO('yolov8n.pt')  # n = nano (fastest), also available: s, m, l, x

print('YOLOv8 model loaded!')
print(f'Model can detect {len(model.names)} classes')
print(f'Classes: {list(model.names.values())[:10]}...')  # Show first 10
Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8n.pt to 'yolov8n.pt': 100% ━━━━━━━━━━━━ 6.2MB 282.8MB/s 0.0s
YOLOv8 model loaded!
Model can detect 80 classes
Classes: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light']...
In [4]:
# Download sample images
sample_urls = [
    'http://images.cocodataset.org/val2017/000000039769.jpg',  # Cats
    'http://images.cocodataset.org/val2017/000000397133.jpg',  # Sports
    'http://images.cocodataset.org/val2017/000000037777.jpg',  # Traffic
]

os.makedirs('samples', exist_ok=True)
image_paths = []

for i, url in enumerate(sample_urls):
    try:
        path = f'samples/image_{i}.jpg'
        urllib.request.urlretrieve(url, path)
        image_paths.append(path)
        print(f'Downloaded image {i+1}')
    except Exception as e:
        print(f'Failed to download image {i+1}: {e}')

print(f'\nReady to detect on {len(image_paths)} images!')
Downloaded image 1
Downloaded image 2
Downloaded image 3

Ready to detect on 3 images!
In [5]:
# Run detection on all images
results = model(image_paths, conf=0.5)  # conf = confidence threshold

# Visualize results
fig, axes = plt.subplots(len(results), 2, figsize=(16, 6*len(results)))
if len(results) == 1:
    axes = axes.reshape(1, -1)

for idx, (result, path) in enumerate(zip(results, image_paths)):
    # Original image
    img = cv2.imread(path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    axes[idx, 0].imshow(img)
    axes[idx, 0].set_title('Original', fontsize=14)
    axes[idx, 0].axis('off')

    # Detection result
    result_img = result.plot()  # YOLOv8 draws boxes automatically
    axes[idx, 1].imshow(result_img)
    axes[idx, 1].set_title(f'Detected: {len(result.boxes)} objects', fontsize=14)
    axes[idx, 1].axis('off')

    # Print detected objects
    print(f'\nImage {idx+1}:')
    for box in result.boxes:
        cls = int(box.cls[0])
        conf = float(box.conf[0])
        print(f'  {model.names[cls]}: {conf:.3f}')

plt.tight_layout()
plt.show()
0: 640x640 2 cats, 1 remote, 29.7ms
1: 640x640 1 person, 3 bowls, 1 oven, 29.7ms
2: 640x640 1 dining table, 1 oven, 1 refrigerator, 29.7ms
Speed: 10.5ms preprocess, 29.7ms inference, 125.0ms postprocess per image at shape (1, 3, 640, 640)

Image 1:
  cat: 0.868
  cat: 0.831
  remote: 0.830

Image 2:
  person: 0.894
  bowl: 0.747
  bowl: 0.745
  bowl: 0.725
  oven: 0.563

Image 3:
  refrigerator: 0.924
  oven: 0.897
  dining table: 0.690
(figure: side-by-side originals and YOLOv8 detections for the three sample images)
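Under the hood, YOLO-family detectors emit many overlapping candidate boxes and prune them with non-maximum suppression (NMS) before you ever see `result.boxes`. Here is a minimal NumPy sketch of greedy NMS; the `iou_thresh` value and the sample boxes are illustrative, not Ultralytics' internal defaults.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) pixel format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop heavy overlaps, repeat."""
    order = list(np.argsort(scores)[::-1])  # indices sorted by descending score
    keep = []
    while order:
        i = order.pop(0)
        keep.append(int(i))
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thresh]
    return keep

# Two heavily overlapping boxes plus one distant box:
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
print(nms(boxes, scores=[0.9, 0.8, 0.7]))  # [0, 2] - the lower-scoring overlap is suppressed
```

Raising `conf` in `model(image_paths, conf=0.5)` filters candidates before NMS even runs, which is why a higher threshold yields fewer, more confident boxes.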

Part 2: Fine-tuning on Custom Dataset¶

Now let's fine-tune YOLOv8 for pedestrian detection using the Penn-Fudan dataset.

In [6]:
# Download Penn-Fudan dataset
!wget -q https://www.cis.upenn.edu/~jshi/ped_html/PennFudanPed.zip
!unzip -q PennFudanPed.zip

print('Dataset downloaded!')
Dataset downloaded!
In [7]:
# Prepare dataset in YOLO format
import shutil
from pathlib import Path

# Create directory structure
dataset_root = Path('pedestrian_dataset')
for split in ['train', 'val']:
    (dataset_root / split / 'images').mkdir(parents=True, exist_ok=True)
    (dataset_root / split / 'labels').mkdir(parents=True, exist_ok=True)

# Convert masks to YOLO format bounding boxes
# (PIL.Image and numpy were already imported above)

def mask_to_bbox(mask_path):
    """Convert mask to YOLO format: class x_center y_center width height (normalized)"""
    mask = np.array(Image.open(mask_path))
    h, w = mask.shape

    bboxes = []
    obj_ids = np.unique(mask)[1:]  # Skip background

    for obj_id in obj_ids:
        pos = np.where(mask == obj_id)
        xmin, xmax = np.min(pos[1]), np.max(pos[1])
        ymin, ymax = np.min(pos[0]), np.max(pos[0])

        # Convert to YOLO format (normalized)
        x_center = ((xmin + xmax) / 2) / w
        y_center = ((ymin + ymax) / 2) / h
        width = (xmax - xmin) / w
        height = (ymax - ymin) / h

        bboxes.append(f'0 {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}')

    return bboxes

# Process all images
img_dir = Path('PennFudanPed/PNGImages')
mask_dir = Path('PennFudanPed/PedMasks')
images = sorted(list(img_dir.glob('*.png')))

# Split train/val (80/20)
split_idx = int(0.8 * len(images))
train_images = images[:split_idx]
val_images = images[split_idx:]

for split, img_list in [('train', train_images), ('val', val_images)]:
    for img_path in img_list:
        # Copy image
        shutil.copy(img_path, dataset_root / split / 'images' / img_path.name)

        # Create label file
        mask_path = mask_dir / img_path.name.replace('.png', '_mask.png')
        bboxes = mask_to_bbox(mask_path)

        label_path = dataset_root / split / 'labels' / img_path.name.replace('.png', '.txt')
        with open(label_path, 'w') as f:
            f.write('\n'.join(bboxes))

print('Dataset prepared:')
print(f'  Train: {len(train_images)} images')
print(f'  Val: {len(val_images)} images')
Dataset prepared:
  Train: 136 images
  Val: 34 images
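To sanity-check what `mask_to_bbox` produces, here is the same arithmetic applied to a tiny synthetic mask; the 10x20 array and the object's placement are made up purely for illustration.

```python
import numpy as np

# Hypothetical 10x20 mask with one object (id 1) covering rows 2..5, cols 4..9.
mask = np.zeros((10, 20), dtype=np.uint8)
mask[2:6, 4:10] = 1

h, w = mask.shape
pos = np.where(mask == 1)
xmin, xmax = pos[1].min(), pos[1].max()  # 4, 9
ymin, ymax = pos[0].min(), pos[0].max()  # 2, 5

# Same normalization as mask_to_bbox above:
x_center = (xmin + xmax) / 2 / w  # 6.5 / 20 = 0.325
y_center = (ymin + ymax) / 2 / h  # 3.5 / 10 = 0.35
width = (xmax - xmin) / w         # 5 / 20  = 0.25
height = (ymax - ymin) / h        # 3 / 10  = 0.3
print(f'0 {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}')
```

Every value is normalized to [0, 1] by the image dimensions, which is what lets YOLO train at any `imgsz` without rewriting the labels.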
In [8]:
# Create dataset config file
config = f"""
path: {dataset_root.absolute()}
train: train/images
val: val/images

nc: 1
names: ['pedestrian']
"""

with open('pedestrian.yaml', 'w') as f:
    f.write(config)

print('Config file created!')
Config file created!
In [9]:
# Visualize a sample from training data
sample_img = train_images[0]
sample_label = dataset_root / 'train' / 'labels' / sample_img.name.replace('.png', '.txt')

# Read image
img = cv2.imread(str(dataset_root / 'train' / 'images' / sample_img.name))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
h, w = img.shape[:2]

# Read and draw boxes
with open(sample_label) as f:
    for line in f:
        cls, x_c, y_c, width, height = map(float, line.strip().split())

        # Convert back to pixel coordinates
        x_c, y_c, width, height = x_c * w, y_c * h, width * w, height * h
        x1 = int(x_c - width/2)
        y1 = int(y_c - height/2)
        x2 = int(x_c + width/2)
        y2 = int(y_c + height/2)

        cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)

plt.figure(figsize=(10, 8))
plt.imshow(img)
plt.title('Sample Training Image with Annotations')
plt.axis('off')
plt.show()
(figure: sample training image with ground-truth pedestrian boxes drawn in green)
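The denormalization in the cell above is easy to verify in isolation. This round-trip sketch, using a hypothetical 640x480 image and made-up label values, converts a normalized YOLO box to pixel corners and back:

```python
# Hypothetical image size and normalized label values (not from the dataset):
w, h = 640, 480
x_c, y_c, bw, bh = 0.5, 0.25, 0.2, 0.1

# Normalized -> pixel corners (same arithmetic as the plotting cell):
px, py, pw, ph = x_c * w, y_c * h, bw * w, bh * h
x1, y1 = int(px - pw / 2), int(py - ph / 2)
x2, y2 = int(px + pw / 2), int(py + ph / 2)
print((x1, y1, x2, y2))  # (256, 96, 384, 144)

# Pixel corners -> normalized recovers the original label values:
assert abs((x1 + x2) / 2 / w - x_c) < 1e-9
assert abs((y1 + y2) / 2 / h - y_c) < 1e-9
assert abs((x2 - x1) / w - bw) < 1e-9
assert abs((y2 - y1) / h - bh) < 1e-9
```

The `int()` truncation can shift corners by up to a pixel for arbitrary values, which is harmless for visualization but worth remembering if you ever re-export labels.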
In [10]:
# Fine-tune YOLOv8
model = YOLO('yolov8n.pt')  # Start from pre-trained weights

# Train
results = model.train(
    data='pedestrian.yaml',
    epochs=20,
    imgsz=640,
    batch=8,
    name='pedestrian_detector',
    patience=5,  # Early stopping
    save=True,
    verbose=True
)

print('Training complete!')
Ultralytics 8.3.213 🚀 Python-3.12.11 torch-2.8.0+cu126 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=8, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=pedestrian.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=20, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=pedestrian_detector, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=5, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=/content/runs/detect/pedestrian_detector, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=True, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Downloading https://ultralytics.com/assets/Arial.ttf to '/root/.config/Ultralytics/Arial.ttf': 100% ━━━━━━━━━━━━ 755.1KB 104.8MB/s 0.0s
Overriding model.yaml nc=80 with nc=1

                   from  n    params  module                                       arguments                     
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]                 
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]                
  2                  -1  1      7360  ultralytics.nn.modules.block.C2f             [32, 32, 1, True]             
  3                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]                
  4                  -1  2     49664  ultralytics.nn.modules.block.C2f             [64, 64, 2, True]             
  5                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]               
  6                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]           
  7                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128, 256, 3, 2]              
  8                  -1  1    460288  ultralytics.nn.modules.block.C2f             [256, 256, 1, True]           
  9                  -1  1    164608  ultralytics.nn.modules.block.SPPF            [256, 256, 5]                 
 10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 11             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 12                  -1  1    148224  ultralytics.nn.modules.block.C2f             [384, 128, 1]                 
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 15                  -1  1     37248  ultralytics.nn.modules.block.C2f             [192, 64, 1]                  
 16                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]                
 17            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 18                  -1  1    123648  ultralytics.nn.modules.block.C2f             [192, 128, 1]                 
 19                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]              
 20             [-1, 9]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 21                  -1  1    493056  ultralytics.nn.modules.block.C2f             [384, 256, 1]                 
 22        [15, 18, 21]  1    751507  ultralytics.nn.modules.head.Detect           [1, [64, 128, 256]]           
Model summary: 129 layers, 3,011,043 parameters, 3,011,027 gradients, 8.2 GFLOPs

Transferred 319/355 items from pretrained weights
Freezing layer 'model.22.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks...
Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt to 'yolo11n.pt': 100% ━━━━━━━━━━━━ 5.4MB 337.6MB/s 0.0s
AMP: checks passed ✅
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 3001.8±541.4 MB/s, size: 281.5 KB)
train: Scanning /content/pedestrian_dataset/train/labels... 136 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 136/136 478.2it/s 0.3s
train: New cache created: /content/pedestrian_dataset/train/labels.cache
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2156.2±1363.6 MB/s, size: 240.6 KB)
val: Scanning /content/pedestrian_dataset/val/labels... 34 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 34/34 440.1it/s 0.1s
val: New cache created: /content/pedestrian_dataset/val/labels.cache
Plotting labels to /content/runs/detect/pedestrian_detector/labels.jpg... 
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.002, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to /content/runs/detect/pedestrian_detector
Starting training for 20 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       1/20      1.14G      1.017      2.003      1.182         40        640: 100% ━━━━━━━━━━━━ 17/17 4.9it/s 3.5s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 2.4it/s 1.2s
                   all         34         69          1      0.378      0.954      0.716

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       2/20      1.35G     0.8843      1.117      1.078         37        640: 100% ━━━━━━━━━━━━ 17/17 13.5it/s 1.3s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 16.2it/s 0.2s
                   all         34         69      0.941      0.681      0.807      0.599

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       3/20      1.36G     0.9591      1.138      1.111         38        640: 100% ━━━━━━━━━━━━ 17/17 14.6it/s 1.2s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 17.3it/s 0.2s
                   all         34         69      0.937      0.623      0.741      0.526

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       4/20      1.38G     0.9491       1.08      1.095         55        640: 100% ━━━━━━━━━━━━ 17/17 15.1it/s 1.1s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 15.6it/s 0.2s
                   all         34         69       0.94      0.551      0.748      0.506

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       5/20       1.4G     0.9086      1.077      1.106         49        640: 100% ━━━━━━━━━━━━ 17/17 14.4it/s 1.2s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 16.1it/s 0.2s
                   all         34         69      0.968      0.884      0.952       0.72

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       6/20      1.41G     0.9041       1.05      1.103         35        640: 100% ━━━━━━━━━━━━ 17/17 14.1it/s 1.2s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 17.3it/s 0.2s
                   all         34         69      0.934      0.822      0.951      0.676

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       7/20      1.43G     0.9277      1.029      1.093         49        640: 100% ━━━━━━━━━━━━ 17/17 14.6it/s 1.2s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 17.3it/s 0.2s
                   all         34         69      0.836       0.87      0.906      0.621

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       8/20      1.45G      0.922     0.9991      1.096         40        640: 100% ━━━━━━━━━━━━ 17/17 15.1it/s 1.1s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 15.3it/s 0.2s
                   all         34         69       0.81      0.863      0.908      0.697

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       9/20      1.46G     0.9551     0.9937      1.135         48        640: 100% ━━━━━━━━━━━━ 17/17 14.4it/s 1.2s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 15.7it/s 0.2s
                   all         34         69      0.966      0.828      0.951        0.7

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      10/20      1.48G      0.853     0.9065      1.068         47        640: 100% ━━━━━━━━━━━━ 17/17 14.4it/s 1.2s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 17.4it/s 0.2s
                   all         34         69      0.928      0.941      0.975      0.747
Closing dataloader mosaic
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      11/20       1.5G     0.7373      1.136       1.03         24        640: 100% ━━━━━━━━━━━━ 17/17 9.8it/s 1.7s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 15.2it/s 0.2s
                   all         34         69          1      0.881      0.968      0.756

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      12/20      1.52G     0.6828     0.9378     0.9684         17        640: 100% ━━━━━━━━━━━━ 17/17 14.3it/s 1.2s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 15.2it/s 0.2s
                   all         34         69      0.971      0.959       0.99      0.775

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      13/20      1.53G     0.6926     0.9309     0.9717         26        640: 100% ━━━━━━━━━━━━ 17/17 15.6it/s 1.1s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 16.0it/s 0.2s
                   all         34         69       0.97      0.986       0.99      0.742

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      14/20      1.55G      0.614     0.8391     0.9366         25        640: 100% ━━━━━━━━━━━━ 17/17 15.2it/s 1.1s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 15.8it/s 0.2s
                   all         34         69      0.985      0.958      0.988      0.765

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      15/20      1.57G     0.6125     0.8295      0.931         17        640: 100% ━━━━━━━━━━━━ 17/17 15.4it/s 1.1s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 16.4it/s 0.2s
                   all         34         69      0.998      0.957      0.989      0.803

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      16/20      1.59G     0.6445     0.8344     0.9759         14        640: 100% ━━━━━━━━━━━━ 17/17 15.6it/s 1.1s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 16.4it/s 0.2s
                   all         34         69      0.971      0.985      0.985      0.793

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      17/20      1.61G     0.6331     0.8131     0.9616         19        640: 100% ━━━━━━━━━━━━ 17/17 15.2it/s 1.1s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 16.1it/s 0.2s
                   all         34         69      0.985      0.954      0.985      0.792

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      18/20      1.62G     0.5942     0.7888     0.9376         22        640: 100% ━━━━━━━━━━━━ 17/17 15.4it/s 1.1s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 16.7it/s 0.2s
                   all         34         69      0.968      0.957      0.989      0.811

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      19/20      1.64G     0.5801     0.7577     0.9168         16        640: 100% ━━━━━━━━━━━━ 17/17 15.1it/s 1.1s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 17.4it/s 0.2s
                   all         34         69      0.984      0.957      0.993      0.828

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      20/20      1.66G     0.5697     0.7572     0.9371         29        640: 100% ━━━━━━━━━━━━ 17/17 15.1it/s 1.1s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 16.1it/s 0.2s
                   all         34         69          1      0.952      0.993      0.819

20 epochs completed in 0.010 hours.
Optimizer stripped from /content/runs/detect/pedestrian_detector/weights/last.pt, 6.2MB
Optimizer stripped from /content/runs/detect/pedestrian_detector/weights/best.pt, 6.2MB

Validating /content/runs/detect/pedestrian_detector/weights/best.pt...
Ultralytics 8.3.213 🚀 Python-3.12.11 torch-2.8.0+cu126 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,005,843 parameters, 0 gradients, 8.1 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 14.5it/s 0.2s
                   all         34         69      0.984      0.957      0.993      0.828
Speed: 0.1ms preprocess, 0.9ms inference, 0.0ms loss, 1.7ms postprocess per image
Results saved to /content/runs/detect/pedestrian_detector
Training complete!

Part 3: Evaluation and Results¶

In [11]:
# Load best model
model = YOLO('runs/detect/pedestrian_detector/weights/best.pt')

# Evaluate on validation set
metrics = model.val(data='pedestrian.yaml')

print('\nValidation Metrics:')
print(f'  mAP50: {metrics.box.map50:.3f}')
print(f'  mAP50-95: {metrics.box.map:.3f}')
print(f'  Precision: {metrics.box.mp:.3f}')
print(f'  Recall: {metrics.box.mr:.3f}')
Ultralytics 8.3.213 🚀 Python-3.12.11 torch-2.8.0+cu126 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,005,843 parameters, 0 gradients, 8.1 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 3944.2±859.8 MB/s, size: 346.1 KB)
val: Scanning /content/pedestrian_dataset/val/labels.cache... 34 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 34/34 41.9Kit/s 0.0s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 1.9it/s 1.6s
                   all         34         69      0.984      0.957      0.993      0.825
Speed: 1.8ms preprocess, 20.4ms inference, 0.0ms loss, 4.3ms postprocess per image
Results saved to /content/runs/detect/val

Validation Metrics:
  mAP50: 0.993
  mAP50-95: 0.825
  Precision: 0.984
  Recall: 0.957
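mAP50 is the mean over classes of average precision (AP) computed at an IoU match threshold of 0.5, while mAP50-95 averages AP over thresholds from 0.5 to 0.95 in steps of 0.05, which is why it is always the stricter number. As a sketch of the AP calculation itself, here is the area under a monotonized precision-recall curve (the all-point-interpolation variant; Ultralytics' exact implementation may differ in details):

```python
import numpy as np

def average_precision(recall, precision):
    """AP as the area under the precision-recall curve (all-point interpolation).

    recall and precision are arrays ordered by descending detection confidence.
    """
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # Make precision monotonically non-increasing, right to left.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas wherever recall actually changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

# A detector that finds every object with no false positives has AP = 1.0:
print(average_precision(np.array([0.5, 1.0]), np.array([1.0, 1.0])))  # 1.0
```

A near-perfect mAP50 like the one above should be read cautiously: the validation set here is only 34 images from the same distribution as training.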
In [12]:
# Test on validation images
test_images = list((dataset_root / 'val' / 'images').glob('*.png'))[:6]

fig, axes = plt.subplots(2, 3, figsize=(18, 12))
axes = axes.flat

for ax, img_path in zip(axes, test_images):
    # Run detection
    result = model(str(img_path), conf=0.5)[0]

    # Visualize
    result_img = result.plot()
    ax.imshow(result_img)
    ax.set_title(f'Detected: {len(result.boxes)} pedestrians', fontsize=12)
    ax.axis('off')

plt.tight_layout()
plt.show()
image 1/1 /content/pedestrian_dataset/val/images/PennPed00082.png: 544x640 1 pedestrian, 109.0ms
Speed: 2.4ms preprocess, 109.0ms inference, 9.7ms postprocess per image at shape (1, 3, 544, 640)

image 1/1 /content/pedestrian_dataset/val/images/PennPed00079.png: 608x640 1 pedestrian, 65.6ms
Speed: 2.3ms preprocess, 65.6ms inference, 1.4ms postprocess per image at shape (1, 3, 608, 640)

image 1/1 /content/pedestrian_dataset/val/images/PennPed00077.png: 608x640 1 pedestrian, 6.8ms
Speed: 2.2ms preprocess, 6.8ms inference, 1.3ms postprocess per image at shape (1, 3, 608, 640)

image 1/1 /content/pedestrian_dataset/val/images/PennPed00074.png: 512x640 2 pedestrians, 63.1ms
Speed: 1.9ms preprocess, 63.1ms inference, 1.4ms postprocess per image at shape (1, 3, 512, 640)

image 1/1 /content/pedestrian_dataset/val/images/PennPed00093.png: 544x640 1 pedestrian, 8.2ms
Speed: 2.3ms preprocess, 8.2ms inference, 1.3ms postprocess per image at shape (1, 3, 544, 640)

image 1/1 /content/pedestrian_dataset/val/images/PennPed00065.png: 640x640 1 pedestrian, 7.4ms
Speed: 2.5ms preprocess, 7.4ms inference, 1.3ms postprocess per image at shape (1, 3, 640, 640)
(figure: fine-tuned model detections on six validation images, one box count per panel)
In [13]:
# Visualize training curves
from IPython.display import Image as IPImage, display

print('Training Results:')
display(IPImage('runs/detect/pedestrian_detector/results.png'))
Training Results:
(figure: training curves from results.png, showing box/cls/dfl losses, precision, recall, and mAP per epoch)